chore: migrate from bincode to postcard serialization #5

Merged: mickvandijke merged 6 commits into main from pr4 on Feb 9, 2026

Conversation

@dirvine (Collaborator) commented Feb 3, 2026

Summary

  • Replace bincode dependency with postcard 1.1
  • Update ChunkMessage encode/decode to use postcard::to_stdvec and from_bytes
  • Remove unused MAX_WIRE_MESSAGE_SIZE constant
  • Update documentation comments to reference postcard
  • Bump version to 0.3.2

Reason

bincode is unmaintained (RUSTSEC-2025-0141); postcard is an actively maintained replacement with a similar API.

Test plan

  • All chunk protocol tests pass
  • Builds successfully

🤖 Generated with Claude Code

dirvine and others added 3 commits February 1, 2026 21:42
- Replace bincode dependency with postcard 1.1
- Update ChunkMessage encode/decode to use postcard::to_stdvec and from_bytes
- Remove unused MAX_WIRE_MESSAGE_SIZE constant
- Update documentation comments to reference postcard
- Bump version to 0.3.2
- All chunk protocol tests pass

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 3, 2026 18:51

Copilot AI left a comment


Pull request overview

This PR migrates the chunk protocol from bincode to postcard for serialization, addressing the unmaintained bincode dependency (RUSTSEC-2025-0141). The change involves updating serialization calls, removing size limit constants that were specific to bincode, and updating documentation to reflect the new serialization format.

Changes:

  • Replaced bincode with postcard 1.1 dependency in Cargo.toml
  • Updated ChunkMessage encode/decode methods to use postcard API
  • Removed MAX_WIRE_MESSAGE_SIZE constant (bincode-specific)
  • Updated comments and documentation to reference postcard instead of bincode

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file

| File | Description |
| --- | --- |
| Cargo.toml | Bumped version to 0.3.2 and replaced bincode dependency with postcard 1.1 |
| src/ant_protocol/chunk.rs | Updated encode/decode to use postcard, removed MAX_WIRE_MESSAGE_SIZE constant |
| src/ant_protocol/mod.rs | Updated documentation to reference postcard serialization |
| tests/e2e/testnet.rs | Updated comments to reference postcard and improved documentation clarity |
| tests/e2e/integration_tests.rs | Refactored message receiver logic and improved timeout comment |
| src/storage/handler.rs | Improved error handling for data size overflow |
| src/storage/disk.rs | Added atomic chunk tracking with current_chunks field and related logic |
| src/client/quantum.rs | Added chunk address and content hash validation |
| src/client/chunk_protocol.rs | Improved handling of lagged broadcast receiver errors |

```diff
-            .with_limit(MAX_WIRE_MESSAGE_SIZE)
-            .allow_trailing_bytes()
-            .serialize(self)
+        postcard::to_stdvec(self)
```

Copilot AI Feb 3, 2026


The postcard serialization lacks the explicit size limit that bincode had (MAX_WIRE_MESSAGE_SIZE). This could allow unbounded allocations when deserializing malicious messages. Consider using postcard's to_allocvec with a size limit or validating the input buffer size before deserialization to prevent DoS attacks.

```diff
-            .with_limit(MAX_WIRE_MESSAGE_SIZE)
-            .allow_trailing_bytes()
-            .deserialize(data)
+        postcard::from_bytes(data)
```

Copilot AI Feb 3, 2026


Missing input size validation before deserialization. The removed MAX_WIRE_MESSAGE_SIZE constant provided protection against malicious payloads. Add a check to reject input buffers exceeding a reasonable maximum (e.g., MAX_CHUNK_SIZE + 1MB overhead) before calling postcard::from_bytes.

Comment on lines +171 to +174:

```rust
let mut release_slot = || {
    if reserved_slot {
        self.current_chunks.fetch_sub(1, Ordering::SeqCst);
        reserved_slot = false;
```

Copilot AI Feb 3, 2026


The closure captures self immutably but attempts to mutate the captured reserved_slot variable. While this compiles because reserved_slot is moved into the closure and the closure is FnMut, the logic is fragile. Consider using a guard pattern with Drop implementation or explicitly calling the cleanup in each error path to make the intent clearer and avoid potential issues with closure captures.

Suggested change:

```diff
-let mut release_slot = || {
-    if reserved_slot {
-        self.current_chunks.fetch_sub(1, Ordering::SeqCst);
-        reserved_slot = false;
+let release_slot = || {
+    if reserved_slot {
+        self.current_chunks.fetch_sub(1, Ordering::SeqCst);
```
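The Drop-based guard pattern the review suggests could be sketched as follows. This is a minimal, hypothetical sketch: `SlotGuard` and its methods are invented names, with a bare `AtomicU64` standing in for the `current_chunks` counter in `disk.rs`. The guard releases the reserved slot on every early-return path automatically, and a `commit` call disarms it on the success path.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// RAII guard for a reserved chunk slot (hypothetical name).
/// Dropping the guard releases the slot unless `commit` was called.
struct SlotGuard<'a> {
    counter: &'a AtomicU64,
    armed: bool,
}

impl<'a> SlotGuard<'a> {
    /// Reserve a slot by incrementing the counter.
    fn reserve(counter: &'a AtomicU64) -> Self {
        counter.fetch_add(1, Ordering::SeqCst);
        SlotGuard { counter, armed: true }
    }

    /// Keep the slot (store succeeded): disarm the guard so Drop is a no-op.
    fn commit(mut self) {
        self.armed = false;
    }
}

impl Drop for SlotGuard<'_> {
    fn drop(&mut self) {
        if self.armed {
            // Error path: release the reserved slot.
            self.counter.fetch_sub(1, Ordering::SeqCst);
        }
    }
}
```

Compared with the `FnMut` closure in the diff above, the guard cannot be forgotten on a newly added error path, since release happens in `Drop`.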
```rust
let scan_dir = chunks_dir.clone();
let existing_chunks = spawn_blocking(move || Self::count_existing_chunks(&scan_dir))
    .await
    .map_err(|e| Error::Storage(format!("Failed to count chunks: {e}")))?
```

Copilot AI Feb 3, 2026


Both map_err calls produce identical error messages 'Failed to count chunks', making it impossible to distinguish between a join error (task panic/cancellation) and an I/O error during counting. The first should indicate a task execution failure and the second an I/O error.

Suggested change:

```diff
-    .map_err(|e| Error::Storage(format!("Failed to count chunks: {e}")))?
+    .map_err(|e| Error::Storage(format!("Chunk counting task failed: {e}")))?
```

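The two-layer error mapping this comment asks for can be sketched with std only, using `std::thread::spawn(...).join()` as a stand-in for tokio's `spawn_blocking(...).await`: the outer `map_err` handles the task-execution failure (a panic), the inner one the error returned by the counting itself. All names here are illustrative, not the project's actual code.

```rust
use std::thread;

// Stand-in for the real directory scan, which can fail with I/O errors.
fn count_chunks() -> Result<u64, String> {
    Ok(42)
}

// Outer map_err: the task itself failed (panic / cancellation).
// Inner map_err: the counting ran but returned an error.
fn spawn_and_count() -> Result<u64, String> {
    thread::spawn(count_chunks)
        .join()
        .map_err(|_| "Chunk counting task failed: panicked".to_string())?
        .map_err(|e| format!("Failed to count chunks: {e}"))
}
```

With distinct messages at each layer, a log line immediately tells the operator whether the task machinery or the filesystem scan failed.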

greptile-apps bot commented Feb 3, 2026

Greptile Summary

This PR migrates from the unmaintained bincode to postcard serialization (addressing RUSTSEC-2025-0141) and includes several quality improvements to chunk validation and storage accounting.

Key Changes

  • Serialization Migration: Replaced bincode with postcard 1.1 for chunk protocol message encoding/decode
  • Security Enhancement: Added chunk address and content hash validation in quantum.rs to prevent malicious peers from returning incorrect data
  • Storage Improvements: Implemented atomic chunk counting with AtomicU64 to fix capacity race conditions, with proper cleanup on error paths
  • Error Handling: Improved broadcast channel lag handling and overflow protection in size validation

Critical Issues Found

⚠️ Memory Safety Vulnerability: The migration removed MAX_WIRE_MESSAGE_SIZE limit enforcement without replacement. Unlike bincode with .with_limit(), postcard::from_bytes() doesn't enforce size limits by default, allowing a malicious peer to cause unbounded memory allocation via crafted length-prefixed vectors. Both encode() and decode() methods in src/ant_protocol/chunk.rs need size validation added.

Positive Changes

  • The chunk validation additions in quantum.rs (lines 157-187) are excellent security improvements
  • The atomic counting fix in disk.rs properly handles the capacity reservation/cleanup race condition
  • The error handling improvements for broadcast lag are appropriate

Testing

PR indicates all chunk protocol tests pass and builds successfully. However, the removed size limits should be tested with oversized inputs to verify DoS protection.

Confidence Score: 2/5

  • This PR has a critical memory safety issue that must be fixed before merging
  • The removal of MAX_WIRE_MESSAGE_SIZE enforcement creates a DoS vulnerability where malicious peers can trigger unbounded allocations. While postcard is a valid replacement for bincode, the migration didn't preserve the essential size limit protection that prevents memory exhaustion attacks in a P2P network
  • Critical attention needed: src/ant_protocol/chunk.rs (lines 66-67 and 76-77) must add size validation before serialization/deserialization

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/ant_protocol/chunk.rs | Replaced bincode with postcard but removed critical size limits, allowing unbounded memory allocation attacks |
| src/client/quantum.rs | Added chunk address and content hash validation to prevent malicious peers from returning incorrect data |
| src/storage/disk.rs | Added atomic chunk counting, improved capacity management with proper cleanup on errors, fixed race conditions |
| src/storage/handler.rs | Improved u64-to-usize conversion with proper overflow handling for chunk size validation |
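The overflow-safe u64-to-usize conversion mentioned for handler.rs could look like the sketch below. The function name and error type are hypothetical; the point is using `usize::try_from` to reject values that do not fit, rather than truncating with `as`.

```rust
// Hypothetical helper: convert a peer-declared payload size (u64 on the
// wire) into usize, rejecting overflow and oversized declarations.
fn validate_declared_size(declared: u64, max_chunk_size: usize) -> Result<usize, String> {
    let size = usize::try_from(declared)
        .map_err(|_| format!("declared size {declared} does not fit in usize"))?;
    if size > max_chunk_size {
        return Err(format!("declared size {size} exceeds max {max_chunk_size}"));
    }
    Ok(size)
}
```

On 32-bit targets the `try_from` branch catches sizes above `u32::MAX`; on 64-bit targets the explicit maximum check still bounds the allocation.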

Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant ChunkMessage
    participant Postcard
    participant Network
    participant Handler
    participant DiskStorage

    Note over Client,DiskStorage: Chunk PUT Operation (Serialization Migration)

    Client->>ChunkMessage: create PutRequest(address, content)
    ChunkMessage->>Postcard: encode() → to_stdvec(self)
    Note over Postcard: ⚠️ No size limit enforcement<br/>(removed MAX_WIRE_MESSAGE_SIZE)
    Postcard-->>ChunkMessage: Vec<u8>
    ChunkMessage-->>Client: encoded bytes

    Client->>Network: send_message(peer, bytes)
    Network->>Handler: receive message bytes

    Handler->>ChunkMessage: decode(bytes)
    ChunkMessage->>Postcard: from_bytes(data)
    Note over Postcard: ⚠️ Potential unbounded allocation<br/>if malicious length prefix
    Postcard-->>ChunkMessage: ChunkMessage
    ChunkMessage-->>Handler: decoded message

    Handler->>Handler: validate address & content
    Handler->>DiskStorage: put(address, content)
    Note over DiskStorage: Atomic counter prevents<br/>capacity race conditions
    DiskStorage-->>Handler: Result<bool>

    Handler->>ChunkMessage: encode PutResponse
    ChunkMessage->>Postcard: to_stdvec(response)
    Postcard-->>Handler: response bytes
    Handler->>Network: send response
```


greptile-apps bot left a comment


4 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment on lines 66 to 67:

```rust
postcard::to_stdvec(self)
    .map_err(|e| ProtocolError::SerializationFailed(e.to_string()))
```


postcard doesn't enforce size limits by default - to_stdvec and from_bytes can allocate unbounded memory if a malicious peer sends a large length-prefixed Vec. The removed MAX_WIRE_MESSAGE_SIZE limit prevented this attack.

Add size validation:

Suggested change:

```diff
-postcard::to_stdvec(self)
-    .map_err(|e| ProtocolError::SerializationFailed(e.to_string()))
+let bytes = postcard::to_stdvec(self)
+    .map_err(|e| ProtocolError::SerializationFailed(e.to_string()))?;
+if bytes.len() > MAX_CHUNK_SIZE + 1024 * 1024 {
+    return Err(ProtocolError::SerializationFailed("Message exceeds size limit".to_string()));
+}
+Ok(bytes)
```

Comment on lines 76 to 77:

```rust
postcard::from_bytes(data)
    .map_err(|e| ProtocolError::DeserializationFailed(e.to_string()))
```


postcard deserialization lacks size limits - a malicious peer can send a crafted message claiming a multi-GB Vec allocation, causing DoS.

Add size check before deserialization:

Suggested change:

```diff
-postcard::from_bytes(data)
-    .map_err(|e| ProtocolError::DeserializationFailed(e.to_string()))
+if data.len() > MAX_CHUNK_SIZE + 1024 * 1024 {
+    return Err(ProtocolError::DeserializationFailed("Message exceeds size limit".to_string()));
+}
+postcard::from_bytes(data)
```

dirvine and others added 2 commits February 3, 2026 18:59
- Fix if_not_else in quantum.rs (swap condition branches)
- Fix manual_let_else in handler.rs (use let-else pattern)
- Fix doc_markdown in mod.rs (add backticks around GraphEntry)
- Fix needless_continue in chunk_protocol.rs (remove redundant continue)
- Apply cargo fmt formatting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 3, 2026 20:23

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.



```rust
pub fn root_dir(&self) -> &Path {
    &self.config.root_dir
}
```

Copilot AI Feb 3, 2026


This function lacks documentation. Add a doc comment explaining that it recursively counts .chunk files in the directory tree, as this is important for understanding the storage initialization behavior.

Suggested change:

```rust
/// Recursively count `.chunk` files in the given directory tree.
///
/// This is used during storage initialization to determine how many
/// chunk files are already persisted on disk, by traversing all
/// subdirectories under `dir` and counting files with the `.chunk`
/// extension.
```

The bincode-to-postcard migration removed MAX_WIRE_MESSAGE_SIZE (5MB),
which was a deliberate safety measure against unbounded allocation from
malicious payloads. Restore the check in ChunkMessage::decode() before
calling postcard::from_bytes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
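The restored check this commit describes might look like the sketch below. The 5 MiB value comes from the commit message; the function name is hypothetical, standing in for whatever guard `ChunkMessage::decode()` performs before calling `postcard::from_bytes`.

```rust
/// 5 MiB — the MAX_WIRE_MESSAGE_SIZE figure given in the commit message.
const MAX_WIRE_MESSAGE_SIZE: usize = 5 * 1024 * 1024;

// Hypothetical guard: reject oversized buffers up front, restoring the
// bound that the bincode `.with_limit(...)` configuration used to enforce.
fn check_wire_size(data: &[u8]) -> Result<(), String> {
    if data.len() > MAX_WIRE_MESSAGE_SIZE {
        return Err(format!(
            "message of {} bytes exceeds {MAX_WIRE_MESSAGE_SIZE}-byte wire limit",
            data.len()
        ));
    }
    Ok(())
}
```

Checking the input length before deserialization keeps the worst-case allocation bounded regardless of what lengths the encoded message claims internally.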
@mickvandijke mickvandijke merged commit 780aff6 into main Feb 9, 2026
8 of 12 checks passed
@mickvandijke mickvandijke deleted the pr4 branch February 9, 2026 10:05
mickvandijke added a commit that referenced this pull request Apr 1, 2026
Add unit and e2e tests covering the remaining Section 18 scenarios:

Unit tests (32 new):
- Quorum: #4 fail→abandoned, #16 timeout→inconclusive, #27 single-round
  dual-evidence, #28 dynamic threshold undersized, #33 batched per-key,
  #34 partial response unresolved, #42 quorum-derived paid-list auth
- Admission: #5 unauthorized peer, #7 out-of-range rejected
- Config: #18 invalid config rejected, #26 dynamic paid threshold
- Scheduling: #8 dedup safety, #8 replica/paid collapse
- Neighbor sync: #35 round-robin cooldown skip, #36 cycle completion,
  #38 snapshot stability mid-join, #39 unreachable removal + slot fill,
  #40 cooldown peer removed, #41 cycle termination guarantee,
  consecutive rounds, cycle preserves sync times
- Pruning: #50 hysteresis prevents premature delete, #51 timestamp reset
  on heal, #52 paid/record timestamps independent, #23 entry removal
- Audit: #19/#53 partial failure mixed responsibility, #54 all pass,
  #55 empty failure discard, #56 repair opportunity filter,
  response count validation, digest uses full record bytes
- Types: #13 bootstrap drain, repair opportunity edge cases,
  terminal state variants
- Bootstrap claims: #46 first-seen recorded, #49 cleared on normal

E2e tests (4 new):
- #2 fresh offer with empty PoP rejected
- #5/#37 neighbor sync request returns response
- #11 audit challenge multi-key (present + absent)
- Fetch not-found for non-existent key

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>